Notes (Week 3 Wednesday)
These are the notes I wrote during the lecture.
This week is *recursion* and *induction*
(you can use these words mostly
interchangeably)
- Recursive function definitions
f(0) = 1 (B)
f(n) = n + f(n-1) if n > 0 (R)
They're called *recursive* because the
symbol we're defining (here f)
may occur on the RHS of the defining
equations.
*But* the occurrences on the RHS must be
for *smaller values* than on the LHS.
Otherwise, the recursion doesn't
bottom out.
In math, recursion that doesn't bottom
out may lead to incomplete or unsound
definitions (don't do it!)
In programming, you'll run forever.
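A quick Haskell rendering of this definition (mine, not from the lecture):
  f :: Integer -> Integer
  f 0 = 1                       -- (B)
  f n | n > 0 = n + f (n - 1)   -- (R): the call is on a smaller value
  -- A clause like  g n = g n + 1  never shrinks its argument,
  -- so evaluating g runs forever, as warned above.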
- Inductive definitions of relations
Let ℕ be inductively defined by the rules:
(B) 0 ∈ ℕ
(I) If n ∈ ℕ then n+1 ∈ ℕ
We take ℕ as the *smallest* set about
which the rules (B) and (I) are valid.
Many sets contain 0 and the successor
of everything in them, e.g. ℝ would also
satisfy rules (B) and (I).
The reason such inductive definitions
work is that there's always a unique
least set that satisfies such
a set of rules.
Namely: the intersection of all the sets
that satisfy the rules.
This will be the same thing as:
the set of everything you can
build by starting from nothing
and using rules (B) and (I) finitely
many times.
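In Haskell the same inductive definition is a datatype (my rendering):
every value of Nat is Zero after finitely many Succs, matching the
"finitely many uses of (B) and (I)" reading.
  data Nat = Zero       -- (B)
           | Succ Nat   -- (I)

  -- Structural recursion: one clause per rule.
  toInt :: Nat -> Integer
  toInt Zero     = 0
  toInt (Succ n) = 1 + toInt n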
This lecture: induction.
We started on Monday with
natural number induction
Suppose I have a proof, using induction from
m upwards, of:
∀n ≥ m. P(n)
by proving
P(m)
and
∀k≥m. P(k) ⇒ P(k+1)
I could have done exactly the same thing using
basic induction to prove:
∀n∈ℕ. P'(n)
where P'(n) ≡ P(n+m)
Basic induction then requires me to prove
(B) P'(0) which is P(m) which I already proved above
(I) ∀k. P'(k) ⇒ P'(k+1)
...which is
∀k. P(k+m) ⇒ P(k+m+1)
...which is equivalent to
∀k≥m. P(k) ⇒ P(k+1)
...which I already proved above.
Q: How to say that P(x) is true in mathematical symbols?
A: ⊧ P(x) (coming in the near future)
Suppose we have a proof of:
P(n) for every l-th n ≥ m (i.e. n = m, m+l, m+2l, ...)
using induction with "big steps"
as on slide 49.
Then we *could* replay the proof
by proving the following using
basic induction:
∀n ∈ ℕ. P'(n)
where P'(n) ≡ P(n*l+m)
Example (steps of size 4): 3 | F(n) when 4 | n.
In the induction step:
F(k+4) =
F(k+3) + F(k+2) =
2*F(k+2) + F(k+1) =
2*(F(k+1) + F(k)) + F(k+1) =
3*F(k+1) + 2*F(k)
We need that
3 | 3*F(k+1) + 2*F(k)
Follows from the IH
that 3 | F(k)
(3 | 3*F(k+1) trivially, and 3 | 2*F(k) since 3 | F(k))
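Not a proof, but a quick sanity check in Haskell (assuming the
convention F(0)=0, F(1)=1):
  fib :: Integer -> Integer
  fib 0 = 0
  fib 1 = 1
  fib n = fib (n - 1) + fib (n - 2)

  -- Every 4th Fibonacci number is divisible by 3:
  -- [ fib (4*k) `mod` 3 | k <- [0..8] ]  ==>  [0,0,0,0,0,0,0,0,0]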
We want to prove
∀n ∈ ℕ. P(n)
Strong induction then requires us to prove
(I) ∀k∈ℕ. (∀m ∈ ℕ. m < k ⇒ P(m)) ⇒ P(k)
^ We need to prove P(k) from the assumption that P holds for
every number smaller than k.
Q: Why is there no base case here? Isn't that spooky?
A: Actually, there is one!
It's (I), but when k is 0.
(∀m ∈ ℕ. m < 0 ⇒ P(m)) ⇒ P(0)
≡
(∀m ∈ ℕ. ⊥ ⇒ P(m)) ⇒ P(0) ⊥ means "false"
≡
(∀m ∈ ℕ. ⊤) ⇒ P(0) ⊤ means "true"
≡
⊤ ⇒ P(0)
≡
P(0)
It's not that the base case isn't in there.
It's just that we don't need to treat it separately.
⊥ is pronounced "bottom"
⊤ is pronounced "top"
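Strong induction is what justifies recursions that jump to *any*
smaller value, not just k-1. A Haskell example (my own, not from
the lecture): computing decimal digits recurses on n `div` 10,
which is smaller than n but not its predecessor.
  digits :: Integer -> [Integer]
  digits n
    | n < 10    = [n]                                -- base case
    | otherwise = digits (n `div` 10) ++ [n `mod` 10]
  -- n `div` 10 < n for n >= 10, so the IH
  -- "P holds for every smaller number" covers the call.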
What we've called "basic induction"
is a special case of structural induction.
Specifically:
basic induction is
structural induction over the ℕs
≺ is
*reflexive* if
∀x. x ≺ x
*anti-reflexive* (also called *irreflexive*) if
∀x. ¬ (x ≺ x)
*non-reflexive* if
∃x. ¬ (x ≺ x)
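On a finite domain you can just test these properties; a small
Haskell sketch (names mine):
  type Rel a = a -> a -> Bool

  reflexive, antiReflexive, nonReflexive :: [a] -> Rel a -> Bool
  reflexive     dom r = all (\x -> r x x)       dom  -- ∀x. x ≺ x
  antiReflexive dom r = all (\x -> not (r x x)) dom  -- ∀x. ¬(x ≺ x)
  nonReflexive  dom r = any (\x -> not (r x x)) dom  -- ∃x. ¬(x ≺ x)

  -- e.g. antiReflexive [0..5] (<)  ==>  True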
(ℕ,<) is a strict poset.
It's also well-founded,
because if you choose your favourite number
(say 6)
0 < ... < 3 < 4 < 6
^ there's no way we can make a chain
that descends forever along < from it
(ℤ,<)
not a wfo because we can make
infinite descending chains
... < -126216 < -1 < 0 < 6
(ℝ+ ∪ {0},<)
is *not* well-founded,
despite being a poset,
and despite having a bottom element
.. < 1/4 < 1/2 < 1
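That halving chain can even be generated lazily in Haskell,
which makes the failure of well-foundedness concrete (a sketch):
  -- An infinite strictly descending chain 1 > 1/2 > 1/4 > ...
  -- that never reaches the bottom element 0.
  chain :: [Double]
  chain = iterate (/ 2) 1

  -- take 4 chain  ==>  [1.0, 0.5, 0.25, 0.125]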
Well-founded relations are closely related to
termination. (they're in fact the same thing!)
Suppose we have a poset
(S, ≺)
where S is the *state space* of a computer program
(the set of all states that could occur during
the execution of a program)
s' ≺ s means
"starting from the state s, it's possible
to reach the state s' by executing the program
a little bit"
Suppose we had an infinite descending chain along ≺.
Q: What does that say about the program?
A: The program might not terminate!
Suppose we know that (S, ≺) is a well-founded order.
Q: What does that say about the program?
A: It always terminates.
If every chain in S (aka every possible execution sequence)
always bottoms out in a state where we can't keep computing,
then we *must* terminate!
Hence: the minimal elements of the poset are the final states
(halting states)
In other words: the class of programs we can build such a wfo on
is the class of programs that are guaranteed to terminate
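For instance (my example, not from the lecture): Euclid's algorithm,
whose state descends along a wfo because the second argument
strictly decreases in (ℕ,<):
  -- Each recursive call has  a `mod` b < b,
  -- so there is no infinite execution chain.
  euclid :: Integer -> Integer -> Integer
  euclid a 0 = a                      -- minimal state: halting
  euclid a b = euclid b (a `mod` b)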
Well-founded induction.
Because strict orders are irreflexive, we can't use one to order a
program that revisits the exact same state (that would need s ≺ s).
Hence: not all programs would be captured by the above.
But there's two ways around this:
- we don't care because we were specifically trying to describe the terminating programs
- we could chuck a counter onto our state that increments with every step
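A sketch of that counter trick (my own names): pair the state with a
step count, so no two points of an execution are the exact same state.
  type Step s = (Integer, s)

  tick :: (s -> s) -> Step s -> Step s
  tick step (n, s) = (n + 1, step s)
  -- Successive augmented states always differ in the counter,
  -- so a strict relation on the execution is no longer ruled out.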
Theorem (Emmy Noether? Or folk?)
The following three statements are equivalent:
1. (S,≺) is a strict poset with no infinite descending chains
2. (S,≺) is a strict poset where every non-empty subset of S has a minimal element
3. The following induction principle is valid:
(well-founded induction or Noetherian induction)
To prove ∀s ∈ S. P(s)
It suffices to prove
∀s ∈ S. (∀s' ∈ S. s' ≺ s ⇒ P(s')) ⇒ P(s)
In other words:
(1) "it terminates"
is equivalent to
(3) "we can use induction to reason about it"
Complete induction on ℕ
is the same thing as
well-founded induction on (ℕ,<)
We could use this to derive another induction principle for ℤ.
(ℤ,≺)
where x ≺ y ≡ |x| < |y|   (|x| means "the absolute value of x")
...this is a WFO (follows from the fact that (ℕ,<) is one)
Therefore: well-founded induction on this order gives us
an induction principle:
"induction on the distance from 0"
In programming:
if v = "he"
and w = "llo"
v + w = "hello"
In formal languages
v = he
w = llo
vw = hello or sometimes v⬝w = hello
(he)⬝(llo)
= (concat.I)
h((e)⬝llo)
= (concat.I)
h(e((λ)⬝llo))
= (concat.B)
h(e(llo))
= by associativity
hello
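The clauses (concat.B) and (concat.I) used above, transliterated
into Haskell with strings as lists of characters (my rendering;
λ becomes []):
  cat :: String -> String -> String
  cat []      w = w             -- (concat.B): λ⬝w = w
  cat (x : v) w = x : cat v w   -- (concat.I): (xv)⬝w = x(v⬝w)

  -- cat "he" "llo"  ==>  "hello",
  -- following the same steps as the derivation above.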
Q: Do the definitions have to
correspond to the inductive definition
of strings?
A: You'll notice that we used exactly one
recursive equation per clause of the
definition of the set of strings.
We don't have to do this,
but it makes life easier for two reasons:
(1) a recursive function of this shape
must terminate.
(2) it's easier to do proofs about it
when our functions and our
induction rules have the same shape.
reverse(hello) = olleh
reverse(he⬝llo) =
reverse(llo)⬝reverse(he) =
oll⬝eh = olleh
reverse(he)
= reverse(e)⬝h
= reverse(λ)⬝e⬝h
= λ⬝e⬝h
= eh
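And reverse itself, one recursive equation per clause of the string
definition (a Haskell sketch; the ⬝x on the right becomes ++ [x]):
  rev :: String -> String
  rev []      = []              -- reverse(λ) = λ
  rev (x : w) = rev w ++ [x]    -- reverse(xw) = reverse(w)⬝x

  -- rev "he"  ==>  "eh";  rev "hello"  ==>  "olleh"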